Abstract

Understanding urban mobility patterns is important because most trips are concentrated
in urban areas. In this paper, a new model is proposed to describe collective human mobility in
urban areas. The model can be applied to predict individual flows not only within cities but also
within countries or over larger ranges. Based on the model, it can be concluded that the exponential
law of the distance distribution is attributable to the exponential decrease of the average density
of human travel demands. Since the distribution of human travel demands depends only on urban
planning, population distribution, regional functions and so on, this illustrates that these inherent
properties of cities are the impetus driving collective human movements.

Understanding human movement patterns has long been considered a challenging problem.
It is crucial to urban planning J2H1 [2E1 12H], epidemic spreading [HEIES]
and traffic engineering [7J [TJl [21]. During the past few years, various mobile devices (e.g.
cellphones and GPS navigators) that support geolocation have been widely used in daily
life. As proxies, these devices record massive amounts of individual tracks. Benefiting from
this, research on human mobility has attracted more and more attention from scientists in
multiple disciplines such as physics, computer science and biology.

In recent studies, Brockmann et al. [I] discovered that human travel displacements can be
described by a power-law distribution by investigating the dispersal of bank notes in the
United States. Gonzalez et al. [8] studied the mobility patterns of mobile phone users in
European countries and found that their travel distances are distributed according to a
truncated power law. Moreover, a similar scaling law was also observed independently in [22]
and [11]. In order to understand the cause of the scaling law, some researchers
have proposed possible explanations from the viewpoint of individual movement [HI [TDJ. It is
worth noting that these studies characterized human travel over large spatial scales,
including trips between countries or between cities.

Furthermore, there are also many studies which focus on human movement in urban
areas. For example, the trajectories of taxi passengers were investigated separately in three
cities: Lisbon [21], Beijing [TB] and Shanghai [15]. All three studies suggested that
trip distances obey exponential distributions rather than power-law ones. Bazzani et al.
[3] analyzed the daily round-trip lengths of private car drivers in Florence and also revealed an
exponential law of lengths. In addition, the distances of individuals' movements in the
London subway were found to deviate clearly from a power-law distribution as well [19].
More convincing evidence is that exponential distributions of intra-urban travel distances
were demonstrated in each of eight cities in Northeast China by analyzing mobile
phone data [H], which are not restricted to a particular means of transportation. Despite many
empirical studies, the understanding of intra-urban mobility is still limited and, to the best
of our knowledge, there is no reasonable model that accounts for the exponential law.

In order to understand the exponential law of collective human mobility in urban areas, it
is essential to model individual flows from one region to another in a city. The
gravity model [2] has already been applied widely to predict flows, including human travel
[U [12], cargo ship movement [13] and telephone communications [15]. Assuming $T_{ij}$ is the
flux of individuals between location i (with population $P_i$) and location j (with population
$P_j$) and $d_{ij}$ is the distance between the two locations, a general gravity law [21] can be
represented by

$$T_{ij} = K \frac{P_i^{\alpha} P_j^{\beta}}{f(d_{ij})}, \qquad (1)$$

where $K$, $\alpha$ and $\beta$ are tunable parameters and $f(d_{ij})$ is often selected as a power or
exponential function of $d_{ij}$. In particular, the gravity model with $\alpha = \beta = 1$ can be derived from
entropy maximization [27]. Although the gravity model is prevalent, it still has some
disadvantages [21]. In particular, the gravity model cannot explain the discrepancy between the
numbers of individual flows in the two directions between a pair of locations. Consequently,
Simini et al. [21] put forward the parameter-free radiation model:

$$\langle T_{ij} \rangle = T_i \frac{P_i P_j}{(P_i + P_{ij})(P_i + P_j + P_{ij})}, \qquad (2)$$

where $\langle T_{ij} \rangle$ is the expected flux from i to j, $T_i$ is the number of trips starting from i and
$P_{ij}$ is the total population of the locations (except i and j) whose distance to i is less than
or equal to $d_{ij}$. The model can predict population movement between cities or countries
successfully, but it is not clear whether the model applies to intra-urban movement as well.
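
As an illustration of the radiation model of Eq. (2), the following minimal Python sketch evaluates the expected flux for a handful of locations. The function name `radiation_flux`, its argument layout and the toy inputs are assumptions made for this example only; they do not come from the original study.

```python
import math


def radiation_flux(populations, coords, trips_out):
    """Expected flux <T_ij> under the radiation model, Eq. (2).

    populations[i] -- population P_i of location i
    coords[i]      -- (x, y) position of location i
    trips_out[i]   -- T_i, the number of trips starting from i
    (All three inputs are hypothetical toy data structures.)
    """
    n = len(populations)
    flux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_ij = math.dist(coords[i], coords[j])
            # P_ij: population of all locations other than i and j
            # whose distance to i is at most d_ij.
            p_ij = sum(
                populations[k]
                for k in range(n)
                if k not in (i, j) and math.dist(coords[i], coords[k]) <= d_ij
            )
            p_i, p_j = populations[i], populations[j]
            flux[i][j] = (
                trips_out[i] * p_i * p_j / ((p_i + p_ij) * (p_i + p_j + p_ij))
            )
    return flux


# Toy example: three locations on a line.
toy_flux = radiation_flux(
    populations=[10, 20, 30],
    coords=[(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)],
    trips_out=[100, 100, 100],
)
```

In this toy example the flux from location 0 to location 1 differs from the flux in the opposite direction, which is the directional asymmetry that the gravity model with symmetric exponents cannot reproduce.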

Therefore, by exploring human travel by taxi in Beijing, this paper aims to answer the
following questions:

• Can the radiation model predict intra-urban human flows? If not, how should
human mobility in urban areas be modeled?

• What is the origin of the exponential law in intra-urban human mobility? Why do
the distributions of travel distances in urban areas disagree with those at larger
scales? What is the inherent impetus that drives collective human movement?



I. EMPIRICAL ANALYSIS OF TAXIS' GPS DATA 



A. Data description 

Here, we use the GPS data generated by over 10000 taxis in Beijing, China, during the
three months ending on Dec. 31st, 2010 [16]. From the taxis' locations and occupancy status
(with or without passengers), the trajectories of passengers can be observed. In order
to study intra-urban human mobility patterns, a total of 12009383 individual tracks that
occurred inside the 6th Ring Road of Beijing were collected.

For the purpose of investigating individual flows between regions, the
urban area on a map is divided into discrete grid-like cells of size $s \times s$ (the selection of
$s$ is described below). The number of cells is $N$ and the Euclidean distance between
the centers of cells i and j is defined as $d_{ij}$. Pick-up points (origins) and drop-off
points (destinations) of trajectories can therefore be simplified to the cells in which they lie.
As illustrated in Fig. 1, an individual trip can be represented by a tuple $(l_O, l_D, t_O, t_D)$,
where $l_O$ and $l_D$ are the origin and destination cells ($l_O, l_D \in \{1, 2, \dots, N\}$), and
$t_O$ and $t_D$ are the departure and arrival times.

FIG. 1: Illustration of the grid division on a map. The cell size is $s \times s$. The solid line
shows an individual trajectory from cell i to j, which can be denoted by the tuple
$(i, j, t_O, t_D)$. The distance $d_{ij}$ between cells i and j is shown by the dashed line.
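
The grid discretization described above can be sketched in Python as follows. This is only an illustrative sketch: the bounding-box origin (`LON_MIN`, `LAT_MIN`), the number of columns `N_COLS` and the helper names are hypothetical values chosen for the example, not parameters reported in the paper, and distances are left in degrees rather than converted to kilometers.

```python
import math

# Hypothetical grid set-up (not taken from the paper).
LON_MIN, LAT_MIN = 116.0, 39.6   # south-west corner of the grid, in degrees
CELL_SIZE = 0.01                 # cell size s, in degrees
N_COLS = 80                      # number of cells per row


def cell_index(lon, lat):
    """Map a GPS point to a 1-based cell id on the s x s grid."""
    col = math.floor((lon - LON_MIN) / CELL_SIZE)
    row = math.floor((lat - LAT_MIN) / CELL_SIZE)
    return row * N_COLS + col + 1


def cell_center(cell):
    """Return the (lon, lat) center of a 1-based cell id."""
    row, col = divmod(cell - 1, N_COLS)
    return (LON_MIN + (col + 0.5) * CELL_SIZE,
            LAT_MIN + (row + 0.5) * CELL_SIZE)


def center_distance(cell_a, cell_b):
    """Euclidean distance d_ij between two cell centers (in degrees)."""
    return math.dist(cell_center(cell_a), cell_center(cell_b))


def trip_tuple(pickup, dropoff, t_depart, t_arrive):
    """Reduce a trajectory to the tuple (l_O, l_D, t_O, t_D)."""
    return (cell_index(*pickup), cell_index(*dropoff), t_depart, t_arrive)


example_trip = trip_tuple((116.355, 39.915), (116.405, 39.905), 0, 900)
```

Here `example_trip` keeps only the origin cell, the destination cell and the two timestamps, so that flows can later be counted per pair of cells.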



B. Choice of cell size 

After dividing the urban area into a lattice, a trip length can be approximated by the
distance between the centers of the cells in which the pick-up and drop-off points lie. The
location errors grow as the cell size increases. At the same time, a cell size that is too small
is also undesirable because it would not clearly reflect the regular mobility patterns between
different regions and would increase computing costs.
Therefore, it is important to choose an appropriate cell size to model urban mobility well.

As shown in Fig. 2a, when the cell size is larger than 0.01 degree, there is some deviation from
the real distribution of trip displacements. So the cell size $s$ is set to 0.01 degree in
the rest of the paper.

For the cell size $s = 0.01°$, the probability distribution of approximated distances is
plotted in Fig. 2b. Since nearly 98% of trips are shorter than 20 km,
the curve can be fitted very well by an exponential function $y = Ce^{-\lambda x}$
with parameter $\lambda = 0.230$.

FIG. 2: Distributions of trip distances. (a) Comparison of the distributions of
approximated distances for different cell sizes with that of actual distances. (b) The distribution of
approximated distances when the cell size $s$ is 0.01°.
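
One simple way to estimate the decay parameter of an exponential function $y = Ce^{-\lambda x}$ from a binned distance distribution is a least-squares line fit on the logarithm of the probabilities. The Python sketch below shows this approach; it is an assumption for illustration, since the paper does not state which fitting procedure was used, and the synthetic data are generated rather than taken from the taxi dataset.

```python
import math


def fit_exponential(xs, ys):
    """Least-squares fit of log(y) = log(C) - lam * x.

    Returns the pair (C, lam). Bins with y <= 0 are skipped because
    their logarithm is undefined.
    """
    pts = [(x, math.log(y)) for x, y in zip(xs, ys) if y > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_ly = sum(ly for _, ly in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pts)
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic, noise-free data with a known decay rate (not real data).
distances = list(range(1, 21))
probabilities = [0.5 * math.exp(-0.23 * d) for d in distances]
c_hat, lam_hat = fit_exponential(distances, probabilities)
```

On noise-free synthetic input the fit recovers the parameters used to generate it; on empirical histograms the result depends on the binning and on which distance range is included.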



C. Geographic distributions of origins and destinations 

Considering the pick-up and drop-off points of human trajectories separately, their probability
density distributions for the three months are obtained by the Gaussian kernel
density estimation (GKDE) method and visualized in Fig. 3. From the figure, it
can be seen that the density maps of origins/destinations for the three months have similar hot
spots, and these hot spots, such as Zhongguancun, Xidan and Beijing West Railway Station,
agree well with intuition.

FIG. 3: Gaussian kernel density estimations of origins and destinations for three months.
Panels: (a) 2010.10 (Origins), (b) 2010.11 (Origins), (c) 2010.12 (Origins),
(d) 2010.10 (Destinations), (e) 2010.11 (Destinations), (f) 2010.12 (Destinations).

In order to quantify the similarities among the geographic distributions of origins and
destinations for the three months, we assign to each cell a probability equal to the fraction of
origins/destinations falling into that grid cell. The probability distribution defined in this way
directly reflects the spatial distribution pattern of human travel intensity. After calculating
the discrete probability distributions of origins/destinations for the three months, the
similarity between distributions can be measured by a cosine value. To be specific, given
two discrete probability distributions $\{p_i\}$ and $\{q_i\}$ ($i = 1, \dots, N$), the similarity
$Sim_{cos}$ between them is defined by

$$Sim_{cos} = \frac{\sum_{i=1}^{N} p_i q_i}{\sqrt{\sum_{i=1}^{N} p_i^2}\,\sqrt{\sum_{i=1}^{N} q_i^2}}.$$

So the similarity takes a value between 0 and 1. The closer the value is to
one, the more similar the two distributions are. The similarities between all distributions
of origins/destinations for the three months are shown in Table I. Most
values of similarity are larger than 0.95, which indicates that the distributions, whether between
origins and destinations or between different months, all resemble each other and follow
similar patterns. In addition, this demonstrates that collective travel behaviors in different
regions of the urban area are stable over time.
TABLE I: The cosine similarities among distributions of origins (O) and destinations (D) for
three months

             2010.10(O)  2010.11(O)  2010.12(O)  2010.10(D)  2010.11(D)  2010.12(D)
2010.10(O)   1.0         0.996       0.995       0.951       0.957       0.961
2010.11(O)   0.996       1.0         0.999       0.945       0.958       0.963
2010.12(O)   0.995       0.999       1.0         0.943       0.956       0.963
2010.10(D)   0.951       0.945       0.943       1.0         0.996       0.994
2010.11(D)   0.957       0.958       0.956       0.996       1.0         0.999
2010.12(D)   0.961       0.963       0.963       0.994       0.999       1.0
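
The cosine similarity between two discrete cell-probability distributions can be computed with a few lines of Python. The sketch below is illustrative only; the function name and the small example vectors are made up and are not data from the study.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equally long probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


# Made-up distributions over four grid cells.
origins = [0.4, 0.3, 0.2, 0.1]
destinations = [0.35, 0.3, 0.25, 0.1]
similarity = cosine_similarity(origins, destinations)
```

Because all entries are non-negative, the value always lies between 0 and 1, and it equals 1 only when the two vectors are proportional to each other.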



It must be noted, however, that the geographic distributions of origins/destinations considered
here are based only on human travel by taxi. It is not clear whether they are noticeably
biased relative to all intra-urban movement by different kinds of transport, including private cars,
buses, subways and taxis. Thus, geo-tagged Sina Weibo data covering 4 weeks in 2012 were
collected for comparison (the data description is given in Appendix A). From the geographic
locations of posts, the movements of Weibo users in Beijing are observed. Like the taxi
dataset, this dataset characterizes human trails in different geographic regions, but it is independent
of the means of transportation. The geographic distributions of passengers' destinations
and of geo-tagged microblog locations are compared in Fig. 4. From the
graph, it can be seen that the hot spots reflected by the two datasets are similar, especially within
the Fourth Ring Road of the city. The main difference between them is that the spatial extent
of human travel in the Sina Weibo dataset is larger than that in the taxi dataset. This is mainly
because the urban area of Beijing extended outward during the intervening two years. The
cosine similarity between these two distributions is 0.829, which is high and further supports
this impression. Because the two different datasets show similar distribution patterns of
destinations and microblog locations, it can be conjectured that the
means of transportation has only a very small impact on the geographic distribution of
origins/destinations of human travel in urban areas.

FIG. 4: The geographic distributions for the two datasets. (a) Taxis (2010.10.1-2010.12.31).
(b) Sina Weibo (2012.10.8-2012.11.4).

Note that some studies have successfully inferred land use and regional functions
by analyzing the spatiotemporal variation of human movement captured from taxi
passengers [29] or cellphone users [23]. Because the geographic distribution of
origins/destinations is influenced by demands for mobility, these results all demonstrate that the
distribution patterns of travel demands or intensity are mainly related to inherent
properties of the city, such as urban planning, regional functions and population density,
rather than to the means of transport.

II. MODELING INTRA-URBAN HUMAN MOBILITY 

In this section, we explore urban flows using the tracks of taxi passengers.
Because the urban area has been divided into grid cells, we try to model and predict traffic
flows between these grid cells.

A. Radiation model 

Recently, the radiation model [21], which is parameter-free and only requires information
on the population distribution, was proposed to characterize mobility patterns between cities or
countries. It overcomes some shortcomings of the gravity model and predicts traffic flows
more precisely. We therefore test whether the radiation model can be used to
simulate collective human travel in the urban area of a city. In urban areas, however, it is difficult
to obtain the population distribution directly because of the high mobility of people. Instead we use
the distribution of destinations in the simulation. This is reasonable because the number
of destinations in a grid cell describes the human travel intensity in that region.




FIG. 5: Simulation by the radiation model. (a) Prediction of traffic flows between grid
cells by the radiation model (horizontal axis: trips in the data). Grey points show the relationship
between the actual and predicted flux for each pair of grid cells. The red line y = x marks where the
actual values equal the predicted values. The black circles denote the mean predicted values in the
bins. The ends of the whiskers represent the 9th and 91st percentiles in each bin. (b) Probability
distributions of actual and predicted trip length (horizontal axis: distance d in km).



The results of the simulation by the radiation model are shown in Fig. 5. From the figure, it
appears that the predicted fluxes deviate considerably from the actual ones and that the model
underestimates the probability of trips with distances larger than 1 km. There are two possible
reasons: one is that the distribution of destinations may be biased relative to the actual
population distribution; the other is that, unlike trips between countries or cities, the population
distribution may be only one of the factors that influence human movement, because people
move frequently for various purposes in urban areas. Therefore, it is necessary to consider
a new model to understand intra-urban human mobility patterns.

B. Our model

For the trips that occurred during a period of time, we can count how many of them
left from or arrived at each cell. The probabilities that people leave from and arrive at
each cell are then defined as follows:

$$P_O(i) = \frac{\#\text{ of trips leaving from cell } i}{\#\text{ of trips}}, \qquad
P_D(i) = \frac{\#\text{ of trips arriving at cell } i}{\#\text{ of trips}}, \qquad i = 1, 2, \dots, N.$$

$P_O$ and $P_D$ thus correspond to the distributions of origins and destinations respectively.



As demonstrated in subsection I C, $P_O$ and $P_D$ depend only on
population distribution, urban planning (land-use and transportation planning) and other
environmental factors. The cells having large $P_O$ or $P_D$ often lie in prosperous
commercial/entertainment regions, developed residential areas, transport hubs and so on.

Intuitively, the probability of a trip's occurrence is positively correlated with $P_O$ of the
origin cell and $P_D$ of the destination cell, and negatively correlated with the Euclidean
distance between the two cells. Hence, in our model the probability of a trip reaching cell
j, conditioned on starting from cell i, is defined as follows:

$$P(j \mid i) \propto \frac{P_D(j)}{f(d_{ij})},$$

where $f(d)$ is a function of the distance between cells. In this paper, two frequently used forms
of $f(d)$ are considered:

Power-law: $f(d) = d^{\sigma}$
Exponential: $f(d) = e^{\lambda d}$

where $\sigma$ and $\lambda$ are parameters whose values depend on the specific system and reflect the effect
of distance on human travel. The probability of a trip from cell i to j can then be
derived as

$$P(i \to j) = P_O(i)\,P(j \mid i) = P_O(i)\,\frac{P_D(j)/f(d_{ij})}{\sum_{k \ne i} P_D(k)/f(d_{ik})}, \qquad (3)$$

and the expected number of trips from cell i to j is

$$\langle T_{ij} \rangle = T\,P(i \to j) = \frac{T}{M(i)}\,\frac{P_O(i)\,P_D(j)}{f(d_{ij})}, \qquad (4)$$

where $M(i) = \sum_{k \ne i} P_D(k)/f(d_{ik})$ and $T$ is the total number of trips.

Note that $P_O \approx P_D$ because the cosine similarity between the
distributions of origins and destinations is greater than 0.95 in the taxi dataset. Furthermore,
when considering human travel over a long time, it is reasonable to assume $P_O = P_D$ due
to round-trip patterns in urban areas. In the gravity model, the number of
trips from cell i to j equals that from cell j to i. However, that is not the case in
our model because the values of $M(i)$ and $M(j)$ depend on the locations of i and j respectively
and are often not equal to each other. Our model is therefore more consistent with intuition.
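
To make Eqs. (3) and (4) concrete, the following Python sketch computes the expected number of trips between every pair of cells for the power-law form $f(d) = d^{\sigma}$. The function name, the three-cell layout and the probability values are hypothetical and serve only to illustrate the formulas.

```python
import math


def expected_trips(p_o, p_d, coords, total_trips, sigma):
    """Expected trips <T_ij> = T * P_O(i) * P_D(j) / (M(i) * d_ij**sigma).

    p_o, p_d    -- origin and destination probabilities per cell
    coords      -- (x, y) centers of the cells
    total_trips -- T, the total number of trips
    sigma       -- exponent of the power-law distance function
    """
    n = len(p_o)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Distance-discounted attractiveness of every other cell k != i.
        weights = [
            0.0 if k == i else p_d[k] / math.dist(coords[i], coords[k]) ** sigma
            for k in range(n)
        ]
        m_i = sum(weights)  # normalizing factor M(i)
        for j in range(n):
            flows[i][j] = total_trips * p_o[i] * weights[j] / m_i
    return flows


# Hypothetical three-cell example with P_O = P_D.
probs = [0.2, 0.3, 0.5]
cells = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
flows = expected_trips(probs, probs, cells, total_trips=1000, sigma=2.0)
```

In this example the trips leaving cell 0 sum to $T \cdot P_O(0) = 200$, and the flow from cell 0 to cell 1 differs from the flow from cell 1 to cell 0 because $M(0) \ne M(1)$, which is the asymmetry discussed above.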

Subsequently, we apply our model to simulate taxi trips in urban areas. Maximum
Likelihood Estimation (MLE) is used to estimate the parameters of our model
(see Appendix B for details). For the two forms of the function $f(d)$, the parameters $\sigma$
and $\lambda$ of our model are estimated as 1.601 and 0.308 respectively. Fig. 6a and 6b,
corresponding to the two forms of $f(d)$ in our model, compare the distance
distributions of simulated trips with the actual ones. It can be seen that the distance
distribution of trips predicted by our model with the power-law form agrees with the
actual one very well, whereas our model with the exponential form underestimates the
number of long-distance trips. Furthermore, the actual and predicted traffic flows between
pairs of cells for the two forms of our model are shown in Fig. 6c and 6d. From the figures,
it can be observed that the red line y = x lies between the 9th and the 91st percentiles
of the bins for our model with the power-law form, indicating that the model can predict
the number of trips between cells accurately. Our model with the exponential form, however,
may underestimate traffic flows. In summary, our model with the power-law form can
be treated as an appropriate model for predicting traffic flows in urban areas. Thus, in the
rest of the paper, we use our model with the power-law form by default.

FIG. 6: Simulations by the two forms of our model. (a-b) Distributions of trip length
simulated by our model. (c-d) Prediction of traffic flows between cells by our model:
(c) power-law form, (d) exponential form.

As demonstrated above, our model is suitable for simulating human travel by taxi
in urban areas. Can our model also be applied to collective human movement at a
large scale? Here, the US commuting dataset is used, which describes commuting
between United States counties in 2000 [21]. In our model, $P_O$ and $P_D$ in formulas
(3) and (4) are replaced by the fractions of population in the counties. The function $f(d)$ takes the
power-law form and $d$ is the distance between counties. Using MLE,
the value of $\sigma$ is estimated as 3.077. Fig. 7 shows the results of predicting the
number of commuters between pairs of counties by the radiation model and by our model.
Both models predict the mobility patterns very well because both red lines
fall almost entirely between the 9th and 91st percentiles in each bin. In addition, judging from the
gray points in the figure, the results predicted by our model appear more compact and show
less fluctuation.




FIG. 7: Predictions on US commuting. (a) Radiation model. (b) Our model. Horizontal axes:
trips in the data.

As a result, we conclude that our model is flexible and can simulate human
movement not only in urban areas but also across a country.

III. ANALYSIS OF DISTANCE DISTRIBUTIONS 

In our model, $P_O$ and $P_D$, corresponding to the distributions of origins and destinations, are
the probabilities of individuals leaving from and arriving at cells. Over a whole day or a longer
time, both are almost equal and depend only on inherent properties of the city. So it is
assumed that

$$P_O(i) = P_D(i) = P(i) \qquad (i = 1, \dots, N),$$

where $P(\cdot)$ reflects human travel demands in grid cell regions.

Supposing the probability $P(\cdot)$ is uniform, trips are simulated based on our model
with two different values of $\sigma$ (1.6 and 2.4). As shown in Fig. 8,
the distance distributions of trips in the two simulations follow a power law with an
exponential cutoff very well. The exponential decay in the tail is caused by the geographic
limits. The power-law exponents of the two distributions are -0.713 for $\sigma = 1.6$ (blue
circles) and -1.486 for $\sigma = 2.4$ (green triangles), which are close to the theoretical values
$1 - \sigma$ (i.e. -0.6 and -1.4) derived in Appendix C and will approach them more closely as
the number of cells $N$ increases. The simulation based on the actual distribution
of travel demands in the urban area of Beijing is also shown in the graph (red crosses); it
is very close to the distribution of actual trip lengths. Compared with the simulations with a
uniform distribution, the travel distances decrease more rapidly even for the same value
of $\sigma$.

FIG. 8: The simulations based on a uniform distribution of human travel demands for
different $\sigma$.



Then, keeping the probability distribution of $P(\cdot)$ the same as the actual one in
Beijing, we randomly rearrange the probability values among the grid cells. As shown in Fig.
9a, the trips simulated on grid cells with randomly permuted probability values follow
power laws in the head and decay more slowly than those simulated on grid
cells with the actual travel demands. Fig. 9b shows the two simulations based on the actual
travel demands reflected by the taxi dataset and by the Sina Weibo dataset.
Both distributions have similar trends in the head, but the tail for the
taxi dataset decreases more sharply. This is reasonable because, as described
before, both geographic distributions of travel demands are similar near urban centers and
the urban area of Beijing expanded during the two years.

In conclusion, this suggests that not only the probability distribution of human travel
demands but also their spatial layout are fundamental elements that account for collective
human travel and determine the distribution of trip lengths. Both depend directly only on urban
planning, population distribution and other properties of the city.



FIG. 9: The simulations on different distributions. (a) Randomized rearrangements of the
actual probability distribution (curves: actual prob., rand. rearrangement-1, rand. rearrangement-2,
rand. rearrangement-3). (b) Comparison of the distributions of trip length based on the actual
probability distributions of taxis and Sina Weibo. Horizontal axes: distance d (km).



As illustrated above, it is no coincidence that the exponential law is found in urban areas
of cities. We now explore the origin of the exponential law that emerges in urban areas.
Considering five hot spot regions in the urban area, namely Beijing West Railway Station (BWRS),
Xizhimen, Beijing South Railway Station (BSRS), Sanlitun and Zhongguancun, the average
densities of destinations or geo-tagged post locations as a function of distance from these
regions are plotted in Fig. 10. In Fig. 10a, the average densities for the five hot spots have
similar trends and decay exponentially, with an exponent of -0.256, which is
not far from the value -0.230 observed for the distance distribution shown in Fig. 2b. In
Fig. 10b, the densities can be fitted by an exponential with exponent -0.231 when distances
lie between 10 km and 20 km, and then decay more quickly. It is worth mentioning that these
findings resemble those of Clark [5], who first used the negative exponential function to describe
urban population density. These results illustrate that the distributions of destinations and
geo-tagged post locations are very similar near the centers of the city and that these distributions
may account for the exponential law in urban areas of cities.

FIG. 10: The average densities of destinations and geo-tagged microblogs with distance
from five hot spot regions (BWRS, Xizhimen, BSRS, Sanlitun, Zhongguancun) in the taxi and
Sina Weibo datasets. (a) Taxis, fitted exponent -0.256. (b) Sina Weibo, fitted exponent -0.231.
Horizontal axes: distance d (km).

Therefore, assuming the density distributions are described by negative exponential
functions with different exponents $\lambda$, we simulate human travel based on our model with the
parameter $\sigma = 1.601$. As shown in Fig. 11, exponents $\lambda$ of about 0.25,
0.5 and 0.8 are considered. These distributions show clear exponentially
decreasing trends, and the fitted exponents of the tails of the distributions are 0.200, 0.455 and
0.808 respectively. As proved in Appendix D, when the density function is exponential, the trip
distance distribution $P(d)$ satisfies $C_1 d^{1-\sigma} e^{-\lambda d} < P(d) < C_2 d^{1-\sigma} e^{-\lambda d}$.
So when $d > 1/\lambda$, the exponential factor dominates and $P(d)$ decreases exponentially. The
fitted exponents are very close to the parameter $\lambda$ of the density function, which agrees with
our theoretical proof. In urban areas, the density function usually decreases significantly, leading to a
large exponent $\lambda$. It must also be noted that the exponent of $d$, namely $1 - \sigma$, is usually
larger than -1 because the parameter $\sigma$ is small for urban areas, which differs from the observed
power-law distance distributions, where the exponents are between -2 and -1. This explains
why human trip distances in urban areas are better described by an exponential distribution.



IV. CONCLUSIONS AND FUTURE WORK 

In daily life, most human activities, especially movements, are concentrated in urban
areas of cities. So it is very important to understand intra-city mobility patterns. In this
paper, we study human mobility patterns in urban areas using a taxi dataset from
Beijing.

FIG. 11: Simulations based on our model with average density distributions decreasing
exponentially for different exponents.

Firstly, the geographic distributions of origins and destinations follow very similar
patterns. Compared with the locations of geo-tagged microblogs, they are also very close to each
other near the centers of the city. This suggests that these distributions are largely independent
of the means of transport and depend only on urban planning, population distribution
and other properties of the city.

Secondly, the radiation model does not appear to model collective human movement in
urban areas very well. So we propose our model and observe the exponential law in the
distribution of simulated trip lengths. Furthermore, our model may be appropriate for
human travel not only in urban areas but also in countries or over larger ranges.

Finally, based on our model, we find that the distribution of trip distances depends
on the geographic distribution of human travel demands, which is an inherent property of the city.
Meanwhile, average human movement intensities are observed to decay exponentially with
distance from hot spots. This can explain the origin of the exponential law found in the actual
trip length distribution.

However, it must be emphasized that the intra-urban mobility considered here is aggregated
over a long period of time. In fact, the traffic flows between regions in urban areas vary
with the time of day and show strong periodic fluctuations. So in future work we will focus
on the temporal characteristics of intra-urban individual flows in order to predict human mobility
more precisely.

Appendix A: Geo-tagged dataset of Sina Weibo

Microblogging services, such as Twitter and Weibo, have become more and more popular
for sharing information with friends or followers. Recently, Weibo, Twitter and
other online location-based services have allowed users to attach their current geographic
locations to messages. In Sina Weibo, when a user posts a geo-tagged microblog, it appears in a
"public timeline" of recent location updates. Using Weibo's geolocation API, we monitored the
public timeline from Oct. 8, 2012 to Nov. 4, 2012 and focused only on the microblogs located
in the urban area of Beijing. A total of 513315 geo-tagged posts were collected at first. After
removing abnormal users and repeated microblogs posted by the same user within a relatively
short time, we finally obtained 491513 geo-tagged posts.



Appendix B: MLE for our model 

Let us consider intra-urban trips during a period of time, which can be denoted as
$Tr = \{(l_O^{(r)}, l_D^{(r)}, t_O^{(r)}, t_D^{(r)}) \mid r = 1, \dots, n\}$, where $n$ is the number of
trips. Supposing these trips are independent of each other, the log probability of $Tr$ can be
calculated as follows:

$$\log P(Tr) = \log \prod_{r=1}^{n} P\big(l_O^{(r)} \to l_D^{(r)}\big)
= \sum_{r=1}^{n} \log P\big(l_O^{(r)} \to l_D^{(r)}\big)
= \sum_{r=1}^{n} \log \frac{P_O(i_r)\,P_D(j_r)/f(d_{i_r j_r})}{\sum_{k \ne i_r} P_D(k)/f(d_{i_r k})},$$

where $i_r = l_O^{(r)}$ and $j_r = l_D^{(r)}$ denote the origin and destination cells of trip $r$.

Here, the Nelder-Mead simplex algorithm [T7] is used to find the minimum of $-\log P(Tr)$
and to evaluate the parameter $\sigma$ or $\lambda$ in the function $f(d)$ numerically.
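
The estimation step can be sketched in Python for the power-law form $f(d) = d^{\sigma}$. Note that this sketch does not use the Nelder-Mead simplex algorithm mentioned above: because only one parameter is fitted, a golden-section search is swapped in so that the example needs nothing beyond the standard library. The function names, the search bracket and the toy trips are all assumptions made for this illustration.

```python
import math


def neg_log_likelihood(sigma, trips, p_o, p_d, coords):
    """-log P(Tr) for f(d) = d**sigma.

    trips is a list of (origin, destination) pairs of 0-based cell indices.
    """
    n = len(p_d)
    # log M(i) for every possible origin cell i.
    log_m = []
    for i in range(n):
        m_i = sum(
            p_d[k] / math.dist(coords[i], coords[k]) ** sigma
            for k in range(n)
            if k != i
        )
        log_m.append(math.log(m_i))
    nll = 0.0
    for i, j in trips:
        d_ij = math.dist(coords[i], coords[j])
        nll -= (math.log(p_o[i]) + math.log(p_d[j])
                - sigma * math.log(d_ij) - log_m[i])
    return nll


def golden_section_minimize(func, lo, hi, tol=1e-6):
    """Minimize a unimodal one-dimensional function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return (a + b) / 2


# Toy trips whose destination frequencies match the model at sigma = 2.
cells = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
uniform = [1 / 3] * 3
toy_trips = ([(0, 1)] * 9 + [(0, 2)] + [(1, 0)] * 4 + [(1, 2)]
             + [(2, 0)] * 4 + [(2, 1)] * 9)
sigma_hat = golden_section_minimize(
    lambda s: neg_log_likelihood(s, toy_trips, uniform, uniform, cells),
    0.0, 5.0,
)
```

For the power-law form the negative log-likelihood is convex in $\sigma$, so a bracketing search of this kind is sufficient; fitting several parameters at once would call for a multidimensional optimizer such as the simplex method named in the text.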



Appendix C: Proof of uniform density distribution 



Supposing the probability $P(\cdot)$ is uniform (i.e. $P(i) = 1/N$, $i = 1, \dots, N$), the
probability of travel distance $d$ simulated with our model can be written as

$$P(d) = \sum_{i,j:\,d_{ij}=d} P(i \to j)
= \sum_{i,j:\,d_{ij}=d} P(i)\,\frac{P(j)/d_{ij}^{\sigma}}{\sum_{k \ne i} P(k)/d_{ik}^{\sigma}},$$

where $d_{ij}$ represents the distance between cells i and j, and $P(i)$ stands for the probability
of selecting cell i.

When $N$ is large, the sums $\sum_{k \ne i} P(k)/d_{ik}^{\sigma}$ have approximately the same value for
different i. Since the number of cells at distance $d$ from a given cell grows in proportion to $d$,

$$P(d) \propto N \cdot d / d^{\sigma} \propto d^{1-\sigma}.$$

Appendix D: Proof of exponential density distribution



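FIG. 12: Illustration of negative exponential density function (points $A(x_0, y_0)$,
$B(x_1, y_1)$ and $C(x_2, y_2)$ relative to the center $O$).

As shown in Fig. 12, the center is the point $O$ and the density distribution is

$$\rho(r) = Ce^{-\lambda r} \qquad (0 < r < R),$$

where $r$ is the distance from the center $O$ and $R$ is the size of the city. Using our model,
we can estimate the displacement distribution of human movement as follows:

$$P(d) = \sum_{A,B:\,d(A,B)=d} \rho(A)\,\frac{\rho(B)\,d^{-\sigma}}{\sum_{C \ne A} \rho(C)\,d(C,A)^{-\sigma}}.$$

Considering the continuity of the density distribution, it can also be written as

$$P(d) = \iint \rho(A)\,\frac{\int_{s:\,d(A,B)=d} \rho(B)\,d^{-\sigma}\,ds}{\iint_{C \ne A} \rho(C)\,d(C,A)^{-\sigma}\,dS_2}\,dS_1.$$

Let

$$I_1 = \int_{s:\,d(A,B)=d} \rho(B)\,d^{-\sigma}\,ds$$

and

$$I_2 = \iint_{C \ne A} \rho(C)\,d(C,A)^{-\sigma}\,dS_2.$$

Therefore, with $x_1 = x_0 + d\cos\alpha$ and $y_1 = y_0 + d\sin\alpha$,

$$I_1 = \int_{s:\,(x_1-x_0)^2+(y_1-y_0)^2=d^2} Ce^{-\lambda\sqrt{x_1^2+y_1^2}}\,d^{-\sigma}\,ds
= \int_0^{2\pi} C\,d^{1-\sigma}\,e^{-\lambda\sqrt{x_0^2+y_0^2+d^2+2d(x_0\cos\alpha+y_0\sin\alpha)}}\,d\alpha,$$

and, with $x_2 = l\cos\beta$ and $y_2 = l\sin\beta$,

$$I_2 = \iint \frac{Ce^{-\lambda\sqrt{x_2^2+y_2^2}}}{\left(\sqrt{(x_2-x_0)^2+(y_2-y_0)^2}\right)^{\sigma}}\,dx_2\,dy_2
= \int_0^{2\pi} d\beta \int_0^{R} \frac{C\,l\,e^{-\lambda l}}{\left(\sqrt{l^2+x_0^2+y_0^2-2l(x_0\cos\beta+y_0\sin\beta)}\right)^{\sigma}}\,dl.$$

$P(d)$ can be represented as

$$P(d) = \iint Ce^{-\lambda\sqrt{x_0^2+y_0^2}}\,\frac{I_1}{I_2}\,dx_0\,dy_0
= \int_0^{2\pi} d\theta \int_0^{R} C\,r\,e^{-\lambda r}\,\frac{U}{V}\,dr
\qquad (x_0 = r\cos\theta,\; y_0 = r\sin\theta),$$

where

$$U = \int_0^{2\pi} d^{1-\sigma}\,e^{-\lambda\sqrt{r^2+d^2+2dr\cos(\theta-\alpha)}}\,d\alpha,$$

$$V = \int_0^{2\pi} d\beta \int_0^{R} l\,e^{-\lambda l}\left(\sqrt{l^2+r^2-2rl\cos(\beta-\theta)}\right)^{-\sigma} dl.$$

We notice that the denominator $V$ does not depend on $d$, so consider the numerator $U$:

$$U = \int_0^{2\pi} d^{1-\sigma}\,e^{-\lambda\sqrt{r^2+d^2+2dr\cos(\theta-\alpha)}}\,d\alpha
> \int_0^{2\pi} d^{1-\sigma}\,e^{-\lambda(r+d)}\,d\alpha
= 2\pi\,d^{1-\sigma}\,e^{-\lambda d}\,e^{-\lambda r}.$$

In a similar way,

$$U < \int_0^{2\pi} d^{1-\sigma}\,e^{-\lambda|d-r|}\,d\alpha
\le \int_0^{2\pi} d^{1-\sigma}\,e^{-\lambda(d-r)}\,d\alpha
= 2\pi\,d^{1-\sigma}\,e^{-\lambda d}\,e^{\lambda r}.$$

As a result,

$$C_1 d^{1-\sigma} e^{-\lambda d} < P(d) < C_2 d^{1-\sigma} e^{-\lambda d}.$$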